UTACLIR @ CLEF 2002: Towards a Unified Translation Process Model
Abstract
The UTACLIR query translation system was originally designed for the CLEF 2000 and 2001 campaigns. In the first two years, the query translation application consisted of separate programs, based on common translation principles, for the language pairs Finnish-English, German-English and Swedish-English. The idea of UTACLIR is based on recognizing distinct source key types and processing them accordingly. The linguistic resources utilized by the framework include morphological analysis or stemming in indexing, stop word removal, normalization of topic words, splitting of compounds written together, handling of non-translated words, phrase composition of compounds in the target language, bilingual dictionaries and structured queries. This year we participated in CLEF with the new UTACLIR system, a single program unified for all the languages. The user gives the system the source and target language codes as well as the query to be translated. The UTACLIR system selects the language resources according to the codes the user has given. A morphological analyser processes the source language words so that they match the entries in the translation dictionary. We have utilized an 18-language dictionary (with all possible language pairs) as the main translation resource of UTACLIR. It is also possible to implement other parallel dictionaries. For the target language it is possible to use either a morphological analyser or a stemmer.
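The flow described in the abstract (language codes select the resources, source words are normalized, stop words are dropped, dictionary translations are grouped per source key) might be sketched roughly as follows. This is only an illustrative toy, not the UTACLIR implementation: the function name `translate_query`, the tables `LEMMAS`, `STOP_WORDS` and `DICTIONARY`, and their contents are all assumptions, and a plain lookup table stands in for a real morphological analyser.

```python
# Toy, hypothetical resources keyed by language code; not UTACLIR's real data.
LEMMAS = {"fi": {"autojen": "auto", "hinnat": "hinta"}}
STOP_WORDS = {"fi": {"ja"}}
DICTIONARY = {
    ("fi", "en"): {"auto": ["car", "automobile"], "hinta": ["price", "cost"]},
}


def translate_query(source: str, target: str, query: str) -> list[list[str]]:
    """Return one group of target-language alternatives per source key."""
    # The language codes decide which resources are used.
    if (source, target) not in DICTIONARY:
        raise ValueError(f"no dictionary for {source}-{target}")
    lemmas = LEMMAS.get(source, {})
    stop_words = STOP_WORDS.get(source, set())
    dictionary = DICTIONARY[(source, target)]

    groups = []
    for token in query.lower().split():
        # Stand-in for morphological analysis: map a word form to its base form.
        base = lemmas.get(token, token)
        if base in stop_words:
            continue
        # Words without a dictionary entry are passed through unchanged.
        groups.append(dictionary.get(base, [base]))
    return groups
```

For example, `translate_query("fi", "en", "Autojen hinnat ja Nokia")` returns `[["car", "automobile"], ["price", "cost"], ["nokia"]]`. Keeping the alternatives for each source key in their own group is one simple way to prepare a structured query, in which the translations of one key can be treated as synonyms.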
Similar resources
UTACLIR @ CLEF 2001: New Features for Handling Compound Words and Untranslatable Proper Names
We participated in CLEF’2001 with four automated bilingual runs. UTACLIR is an automatic query translation and construction system for cross-language information retrieval. The system automatically extracts topical information from request sentences written in one of the source languages and constructs a target language query, based on translations given by a translation dictionary. The new fea...
Utaclir @ CLEF 2001 - Effects of Compound Splitting and N-Gram Techniques
The Tampere University CLEF research group participated in CLEF2001 with four automated bilingual runs. Our cross-lingual software, UTACLIR, uses an automated method for query construction for cross-language information retrieval (CLIR). This method seeks to automatically extract topical information from request sentences written in one of the source languages and to create a target language que...
ITC-irst at CLEF 2002: Using N-best Query Translations for CLIR
This paper reports on the participation of ITC-irst in the Italian monolingual retrieval track and in the bilingual English-Italian track of the Cross Language Evaluation Forum (CLEF) 2002. A cross-language information retrieval system is proposed which integrates retrieval and translation scores over the set of N-best translations of the source query. Translations are computed by a statistical...
Towards a Control Software Design Environment Using a Meta-modelling Technique
The novelty of this paper is mainly the integration of multi-disciplinary software tools into a control software design environment, namely the Integrated Design Notation (IDN). The IDN supports the design, development and implementation of decentralised distributed control systems. This new environment is based on the UML meta-model standard. The translation process to integrate a control soft...
EXETER at CLEF 2002: Experiments with Machine Translation for Monolingual and Bilingual Retrieval
This year, the University of Exeter participated in both the CLEF 2002 monolingual and bilingual task for two languages: Italian and Spanish. We submitted 4 ranked results each for both Italian and Spanish Monolingual tasks and 5 each for the bilingual tasks. We report experimental results from our investigations of merging topic translations from two machine translation (MT) systems and recent...